NSF PAR Search | NSF Public Access Repository


Contextual Restless Multi-Armed Bandits with Application to Demand Response Decision-Making

https://doi.org/10.1109/CDC56724.2024.10886713

Chen, Xin; Hou, I-Hong (December 2024, IEEE)

This paper introduces a novel multi-armed bandit framework, termed Contextual Restless Bandits (CRB), for complex online decision-making. The CRB framework incorporates the core features of contextual bandits and restless bandits, so that it can model both the internal state transitions of each arm and the influence of external global environmental contexts. Using the dual decomposition method, we develop a scalable index policy algorithm for solving the CRB problem and theoretically analyze its asymptotic optimality. For the case when the arm models are unknown, we further propose a model-based online learning algorithm, built on the index policy, that learns the arm models and makes decisions simultaneously. Furthermore, we apply the proposed CRB framework and the index policy algorithm to the demand response decision-making problem in smart grids. Numerical simulations demonstrate the performance and efficiency of the proposed CRB approaches.
Full Text Available
Second-Order Analysis of CSMA Protocols for Age-of-Information Minimization

https://doi.org/10.1109/IEEECONF60004.2024.10943050

Fan, Siqi; Hou, I-Hong (October 2024, IEEE)

This paper introduces a general framework to analyze and optimize age-of-information (AoI) in CSMA protocols for distributed uplink transmissions. The proposed framework combines two theoretical approaches. First, it employs second-order analysis that characterizes all random processes by their respective means and temporal variances and approximates AoI as a function of the mean and temporal variance of the packet delivery process. Second, it employs mean-field approximation to derive the mean and temporal variance of the packet delivery process for one node in the presence of interference from others. To demonstrate the utility of this framework, this paper applies it to the age-threshold ALOHA policy and identifies parameter settings that outperform those previously suggested as optimal in the original work that introduced this policy. Simulation results demonstrate that our framework provides precise AoI approximations and achieves significantly better performance, even in networks with a small number of users.
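As a rough illustration of why the abstract above looks beyond the mean of the packet delivery process, the following sketch computes a slotted AoI sample path for two delivery patterns with the same delivery rate. It is not the paper's analysis. The function names and the convention that a delivery resets the age to 1 are assumptions made for this example.

```python
from typing import List, Sequence


def aoi_trajectory(deliveries: Sequence[int], initial_age: int = 1) -> List[int]:
    """Return the age at the end of each slot.

    Assumed convention (not taken from the paper): a delivery in a slot
    resets the age to 1; otherwise the age grows by 1.
    """
    ages = []
    age = initial_age
    for delivered in deliveries:
        age = 1 if delivered else age + 1
        ages.append(age)
    return ages


def average(values: Sequence[float]) -> float:
    """Plain time average over the observed slots."""
    return sum(values) / len(values)


# Two hypothetical delivery patterns with the same mean rate (0.5 per slot).
periodic = [1, 0, 1, 0, 1, 0, 1, 0]
bursty = [1, 1, 1, 1, 0, 0, 0, 0]

print(average(aoi_trajectory(periodic)))  # 1.5
print(average(aoi_trajectory(bursty)))    # 2.25
```

Both patterns deliver four packets in eight slots, yet the bursty one has a higher average age, which is the kind of effect a mean-only model cannot capture.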
Full Text Available
Distributed No-Regret Learning for Multi-Stage Systems with End-to-End Bandit Feedback

https://doi.org/10.1145/3641512.3686369

Hou, I-Hong (October 2024, ACM)

This paper studies multi-stage systems with end-to-end bandit feedback. In such systems, each job needs to go through multiple stages, each managed by a different agent, before generating an outcome. Each agent can only control its own action and learn the final outcome of the job. It has neither knowledge of nor control over the actions taken by agents in the next stage. The goal of this paper is to develop distributed online learning algorithms that achieve sublinear regret in adversarial environments. The setting of this paper significantly expands the traditional multi-armed bandit problem, which considers only one agent and one stage. In addition to the exploration-exploitation dilemma in the traditional multi-armed bandit problem, we show that the consideration of multiple stages introduces a third component, education, where an agent needs to choose its actions to facilitate the learning of agents in the next stage. To solve this newly introduced exploration-exploitation-education trilemma, we propose a simple distributed online learning algorithm, ϵ-EXP3. We theoretically prove that the ϵ-EXP3 algorithm is a no-regret policy that achieves sublinear regret. Simulation results show that the ϵ-EXP3 algorithm significantly outperforms existing no-regret online learning algorithms for the traditional multi-armed bandit problem.
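For readers unfamiliar with the single-agent baseline that the abstract above extends, here is a minimal sketch of the classical EXP3 algorithm for the traditional one-agent, one-stage bandit problem. It is not the paper's ϵ-EXP3, whose details are not given in this listing. The function name, the `reward_fn` callback, and the parameter values are assumptions made for illustration.

```python
import math
import random
from typing import Callable, List, Tuple


def exp3(num_arms: int, rounds: int,
         reward_fn: Callable[[int, int], float],
         gamma: float = 0.1, seed: int = 0) -> Tuple[List[float], float]:
    """Classical single-agent EXP3; rewards are assumed to lie in [0, 1]."""
    rng = random.Random(seed)
    weights = [1.0] * num_arms
    total_reward = 0.0
    for t in range(rounds):
        weight_sum = sum(weights)
        # Mix the weight-proportional distribution with uniform exploration.
        probs = [(1 - gamma) * w / weight_sum + gamma / num_arms
                 for w in weights]
        arm = rng.choices(range(num_arms), weights=probs)[0]
        reward = reward_fn(t, arm)  # only the pulled arm's reward is seen
        total_reward += reward
        # Importance-weighted estimate keeps the reward estimate unbiased.
        estimate = reward / probs[arm]
        weights[arm] *= math.exp(gamma * estimate / num_arms)
        # Rescale so the weights stay bounded; probabilities are unchanged.
        largest = max(weights)
        weights = [w / largest for w in weights]
    return weights, total_reward


# Hypothetical environment: arm 1 always pays 1, arm 0 always pays 0.
final_weights, total = exp3(2, 500, lambda t, arm: float(arm == 1))
print(final_weights, total)
```

The uniform mixing term `gamma / num_arms` guarantees every arm keeps a minimum selection probability, which bounds the importance-weighted estimates.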
Full Text Available
Deep Index Policy for Multi-Resource Restless Matching Bandit and Its Application in Multi-Channel Scheduling

https://doi.org/10.1145/3641512.3686381

Zamir, Nida; Hou, I-Hong (October 2024, ACM)

Scheduling in multi-channel wireless communication systems presents formidable challenges in allocating resources effectively. To address these challenges, we investigate a multi-resource restless matching bandit (MR-RMB) model for heterogeneous resource systems, with the objective of maximizing long-term discounted total rewards while respecting resource constraints. We also generalize the model to applications beyond multi-channel wireless scheduling. We discuss the Max-Weight Index Matching algorithm, which optimizes resource allocation based on learned partial indexes, and we derive the policy gradient theorem for index learning. Our main contribution is the introduction of a new Deep Index Policy (DIP), an online learning algorithm tailored for MR-RMB. DIP learns the partial index by leveraging the policy gradient theorem for restless arms with convoluted and unknown transition kernels of heterogeneous resources. We demonstrate the utility of DIP by evaluating its performance on three different MR-RMB problems. Our simulation results show that DIP indeed learns the partial indexes efficiently.
Full Text Available
AoI, Timely-Throughput, and Beyond: A Theory of Second-Order Wireless Network Optimization

https://doi.org/10.1109/TNET.2024.3432655

Guo, Daojing; Nakhleh, Khaled; Hou, I-Hong; Kompella, Sastry; Kam, Clement (December 2024, IEEE/ACM Transactions on Networking)

This paper introduces a new theoretical framework for optimizing second-order behaviors of wireless networks. Unlike existing techniques for network utility maximization, which only consider first-order statistics, this framework models every random process by its mean and temporal variance. The inclusion of temporal variance makes this framework well-suited for modeling Markovian fading wireless channels and emerging network performance metrics such as age-of-information (AoI) and timely-throughput. Using this framework, we sharply characterize the second-order capacity region of wireless access networks. We also propose a simple scheduling policy and prove that it can achieve every interior point in the second-order capacity region. To demonstrate the utility of this framework, we apply it to an unsolved network optimization problem where some clients wish to minimize AoI while others wish to maximize timely-throughput. We show that this framework accurately characterizes AoI and timely-throughput. Moreover, it leads to a tractable scheduling policy that outperforms other existing work.
Full Text Available
Learning and Communications Co-Design for Remote Inference Systems: Feature Length Selection and Transmission Scheduling

https://doi.org/10.1109/JSAIT.2023.3322620

Shisher, Md Kamran; Ji, Bo; Hou, I-Hong; Sun, Yin (January 2023, IEEE Journal on Selected Areas in Information Theory)

In this paper, we consider a remote inference system, where a neural network is used to infer a time-varying target (e.g., robot movement), based on features (e.g., video clips) that are progressively received from a sensing node (e.g., a camera). Each feature is a temporal sequence of sensory data. The inference error is determined by (i) the timeliness and (ii) the sequence length of the feature, where we use Age of Information (AoI) as a metric for timeliness. While a longer feature can typically provide better inference performance, it often requires more channel resources for sending the feature. To minimize the time-averaged inference error, we study a learning and communication co-design problem that jointly optimizes feature length selection and transmission scheduling. When there is a single sensor-predictor pair and a single channel, we develop low-complexity optimal co-designs for both the cases of time-invariant and time-variant feature length. When there are multiple sensor-predictor pairs and multiple channels, the co-design problem becomes a restless multi-arm multi-action bandit problem that is PSPACE-hard. For this setting, we design a low-complexity algorithm to solve the problem. Trace-driven evaluations demonstrate the potential of these co-designs to reduce inference error by up to 10000 times.
Full Text Available
Scheduling Real-Time Information-Update Flows for the Optimal Confidence in Estimation

https://doi.org/10.1109/JSAC.2021.3065093

Guo, Daojing; Hou, I-Hong (May 2021, IEEE Journal on Selected Areas in Communications)
Full Text Available
Optimal Wireless Scheduling for Remote Sensing through Brownian Approximation

https://doi.org/10.1109/INFOCOM42981.2021.9488785

Guo, Daojing; Hsieh, Ping-Chun; Hou, I-Hong (May 2021, IEEE Infocom 2021)
This paper studies a remote sensing system where multiple wireless sensors generate possibly noisy information updates of various surveillance fields and deliver these updates to a control center over a wireless network. The control center needs a sufficient number of recently generated information updates to have an accurate estimate of the current system status, which is critical for the control center to make appropriate control decisions. The goal of this work is then to design the optimal policy for scheduling the transmissions of information updates. Through Brownian approximation, we demonstrate that the control center’s ability to make accurate real-time estimates depends on the averages and temporal variances of the delivery processes. We then formulate a constrained optimization problem to find the optimal means and variances. We also develop a simple online scheduling policy that employs the optimal means and variances to achieve the optimal system-wide performance. Simulation results show that our scheduling policy converges quickly and outperforms other state-of-the-art policies.
Full Text Available
Joint Index Coding and Incentive Design for Selfish Clients

https://doi.org/10.1109/TCOMM.2021.3049123

Hsu, Yu-Pin; Hou, I-Hong; Sprintson, Alex (April 2021, IEEE Transactions on Communications)
Full Text Available
Fresher Content or Smoother Playback? A Brownian-Approximation Framework for Scheduling Real-Time Wireless Video Streams

https://doi.org/10.1145/3397166.3409121

Hsieh, Ping-Chun; Liu, Xi; Hou, I-Hong (October 2020, ACM MobiHoc 2020)
This paper presents a Brownian-approximation framework to optimize the quality of experience (QoE) for real-time video streaming in wireless networks. In real-time video streaming, one major challenge is to tackle the natural tension between the two most critical QoE metrics: playback latency and video interruption. To study this trade-off, we first propose an analytical model that precisely captures all aspects of the playback process of a real-time video stream, including playback latency, video interruptions, and packet dropping. Built on this model, we show that the playback process of a real-time video can be approximated by a two-sided reflected Brownian motion. Through such Brownian approximation, we are able to study the fundamental limits of the two QoE metrics and characterize a necessary and sufficient condition for a set of QoE performance requirements to be feasible. We propose a scheduling policy that satisfies any feasible set of QoE performance requirements and then obtain simple rules on the trade-off between playback latency and the video interrupt rates, in both heavy-traffic and under-loaded regimes. Finally, simulation results verify the accuracy of the proposed approximation and show that the proposed policy outperforms other popular baseline policies.
Full Text Available
